In this paper, we discuss the development of a multilingual dataset annotated with a hierarchical, fine-grained tagset marking different types of aggression and the "context" in which they occur. The context here is defined by the conversational thread in which a specific comment occurs, as well as the "type" of discursive role that the comment performs with respect to previous comments. The initial dataset discussed here (and made available as part of the ComMA@ICON shared task) consists of 15,000 annotated comments in four languages - Meitei, Bangla, Hindi, and Indian English - collected from various social media platforms such as YouTube, Facebook, Twitter, and Telegram. As is typical on social media sites, a large number of these comments are multilingual, mostly code-mixed with English. The paper gives a detailed description of the tagset used for annotation and the process of developing a multi-label tagset that can be used for marking comments with various kinds of aggression and bias, including gender bias, religious intolerance (called communal bias in the tagset), class/caste bias, and ethnic/racial bias. We also define and discuss the tags used for marking the discursive roles performed through comments, such as attack, defend, etc. We additionally present a statistical analysis of the dataset and the results of our baseline experiments in developing an automatic aggression identification system using the dataset.
translated by Google Translate
We investigate how humans perform the task of dubbing video content from one language into another, leveraging a novel corpus of 319.57 hours of video from 54 professionally produced titles. This is the first such large-scale study we are aware of. The results challenge a number of assumptions commonly made in both qualitative literature on human dubbing and machine-learning literature on automatic dubbing, arguing for the importance of vocal naturalness and translation quality over commonly emphasized isometric (character length) and lip-sync constraints, and for a more qualified view of the importance of isochronic (timing) constraints. We also find substantial influence of the source-side audio on human dubs through channels other than the words of the translation, pointing to the need for research on ways to preserve speech characteristics, as well as semantic transfer such as emphasis/emotion, in automatic dubbing systems.
Algorithms that involve both forecasting and optimization are at the core of solutions to many difficult real-world problems, such as in supply chains (inventory optimization), traffic, and in the transition towards carbon-free energy generation in battery/load/production scheduling in sustainable energy systems. Typically, in these scenarios we want to solve an optimization problem that depends on unknown future values, which therefore need to be forecast. As both forecasting and optimization are difficult problems in their own right, relatively little research has been done in this area. This paper presents the findings of the "IEEE-CIS Technical Challenge on Predict+Optimize for Renewable Energy Scheduling," held in 2021. We present a comparison and evaluation of the seven highest-ranked solutions in the competition, to provide researchers with a benchmark problem and to establish the state of the art for this benchmark, with the aim of fostering and facilitating research in this area. The competition used data from the Monash Microgrid, as well as weather data and energy market data. It focused on two main challenges: forecasting renewable energy production and demand, and obtaining an optimal schedule for the activities (lectures) and on-site batteries that leads to the lowest cost of energy. The most accurate forecasts were obtained by gradient-boosted tree and random forest models, and optimization was mostly performed using mixed integer linear and quadratic programming. The winning method predicted different scenarios and optimized over all scenarios jointly using a sample average approximation method.
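The sample average approximation idea mentioned for the winning method can be illustrated with a toy scheduling decision. This is a minimal sketch, not the competition's actual model: the cost function, price, penalty, and demand distribution are all illustrative assumptions. Instead of optimizing against a single point forecast, the decision is scored by its cost averaged over sampled demand scenarios.

```python
import numpy as np

rng = np.random.default_rng(0)

def cost(charge_kwh, demand_kwh, price=0.3, penalty=1.0):
    """Toy cost model: energy bought at `price`; unmet demand penalized at `penalty`."""
    unmet = np.maximum(demand_kwh - charge_kwh, 0.0)
    return price * charge_kwh + penalty * unmet

# Forecast uncertainty represented by sampled demand scenarios (kWh).
scenarios = rng.normal(loc=10.0, scale=2.0, size=1000)

# Sample average approximation: pick the decision minimizing the
# cost averaged over all sampled scenarios jointly.
candidates = np.linspace(0, 20, 201)
avg_costs = [cost(c, scenarios).mean() for c in candidates]
best = candidates[int(np.argmin(avg_costs))]
```

Note that the SAA optimum hedges above the mean demand of 10 kWh, because under this cost model unmet demand is penalized more than over-purchasing.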
We propose a novel model-agnostic data-driven reliability analysis framework for time-dependent reliability analysis. The proposed approach -- referred to as MAntRA -- combines interpretable machine learning, Bayesian statistics, and stochastic dynamic equation identification to evaluate the reliability of stochastically-excited dynamical systems for which the governing physics is \textit{a priori} unknown. A two-stage approach is adopted: in the first stage, an efficient variational Bayesian equation discovery algorithm is developed to determine the governing physics of an underlying stochastic differential equation (SDE) from measured output data. The developed algorithm is efficient and accounts for epistemic uncertainty due to limited and noisy data, and aleatoric uncertainty due to environmental effects and external excitation. In the second stage, the discovered SDE is solved using a stochastic integration scheme and the probability of failure is computed. The efficacy of the proposed approach is illustrated on three numerical examples. The results obtained indicate the possible application of the proposed approach for reliability analysis of in-situ and heritage structures from on-site measurements.
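The second stage described above, solving a discovered SDE with a stochastic integration scheme and computing a probability of failure, can be sketched as follows. This is a hypothetical illustration with a stand-in linear drift and additive noise, not the discovered physics or failure criterion from the paper: Euler-Maruyama integration of dx = f(x) dt + g(x) dW, with failure estimated as the fraction of Monte Carlo trajectories whose peak response exceeds a threshold.

```python
import numpy as np

rng = np.random.default_rng(1)

def euler_maruyama_peaks(f, g, x0, dt, n_steps, n_paths):
    """Integrate dx = f(x) dt + g(x) dW and track each path's peak |x|."""
    x = np.full(n_paths, x0, dtype=float)
    peak = np.abs(x)
    for _ in range(n_steps):
        dw = rng.normal(scale=np.sqrt(dt), size=n_paths)
        x = x + f(x) * dt + g(x) * dw
        peak = np.maximum(peak, np.abs(x))
    return peak

# Stand-in dynamics: linear restoring drift, additive excitation.
peak = euler_maruyama_peaks(f=lambda x: -0.5 * x,
                            g=lambda x: 0.2,
                            x0=0.0, dt=0.01, n_steps=1000, n_paths=5000)

# First-passage style failure probability: peak response exceeds a safe limit.
threshold = 0.5
p_fail = float(np.mean(peak > threshold))
```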
Large-scale diffusion-based generative models have led to breakthroughs in text-conditioned high-resolution image synthesis. Starting from random noise, such text-to-image diffusion models gradually synthesize images in an iterative fashion while conditioning on text prompts. We find that their synthesis behavior qualitatively changes throughout this process: Early in sampling, generation strongly relies on the text prompt to generate text-aligned content, while later, the text conditioning is almost entirely ignored. This suggests that sharing model parameters throughout the entire generation process may not be ideal. Therefore, in contrast to existing works, we propose to train an ensemble of text-to-image diffusion models specialized for different synthesis stages. To maintain training efficiency, we initially train a single model, which is then split into specialized models that are trained for the specific stages of the iterative generation process. Our ensemble of diffusion models, called eDiff-I, results in improved text alignment while maintaining the same inference computation cost and preserving high visual quality, outperforming previous large-scale text-to-image diffusion models on the standard benchmark. In addition, we train our model to exploit a variety of embeddings for conditioning, including the T5 text, CLIP text, and CLIP image embeddings. We show that these different embeddings lead to different behaviors. Notably, the CLIP image embedding allows an intuitive way of transferring the style of a reference image to the target text-to-image output. Lastly, we show a technique that enables eDiff-I's "paint-with-words" capability. A user can select a word in the input text and paint it on a canvas to control the output, which is very handy for crafting the desired image they have in mind. The project page is available at https://deepimagination.cc/eDiff-I/
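The stage-specialized ensemble idea amounts to a shared sampling loop that routes each denoising step to the expert trained for that stage. The sketch below only illustrates the routing; the experts are placeholders (they record which expert ran rather than predicting noise), and the stage boundaries are made-up values, not eDiff-I's.

```python
def make_expert(name):
    def denoise(x, t, prompt):
        # A real expert would denoise x given timestep t and the prompt;
        # here we just record which expert handled the step.
        return (name, t)
    return denoise

# Experts for the early, middle, and late stages of the reverse process,
# keyed by the lower bound of their timestep range (illustrative splits).
experts = [(700, make_expert("early")),   # t in (700, 1000]
           (300, make_expert("middle")),  # t in (300, 700]
           (0,   make_expert("late"))]    # t in [0, 300]

def pick_expert(t):
    for t_min, expert in experts:
        if t > t_min or t_min == 0:
            return expert
    raise ValueError(t)

def sample(prompt, timesteps):
    """One shared loop; each step is delegated to its stage's expert."""
    x, trace = "noise", []
    for t in timesteps:
        x = pick_expert(t)(x, t, prompt)
        trace.append(x[0])
    return trace

trace = sample("a cat", timesteps=[900, 500, 100])
# trace == ["early", "middle", "late"]
```

Because all experts share the loop and only one runs per step, inference cost stays that of a single model, matching the efficiency claim above.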
Magnetic resonance imaging (MRI) scans are time-consuming and uncomfortable, as patients must remain still in a confined space for extended periods. To reduce scanning time, some experts have experimented with undersampled k-spaces, attempting to use deep learning to predict the fully sampled result. These studies report that as much as 20 to 30 minutes could be saved off scans that would otherwise take an hour or more. However, none of these studies have explored the possibility of using masked image modeling (MIM) to predict the missing parts of MRI k-spaces. This study utilized 11,161 reconstructed MRI images and k-spaces of knee MRIs from Facebook's fastMRI dataset. It tested a modified version of an existing model using baseline shifted window (Swin) and vision transformer architectures that applies MIM to undersampled k-spaces in order to predict the full k-space, and thereby the full MRI image. The modifications were made using the PyTorch and NumPy libraries and published to a GitHub repository. After the model reconstructed the k-space images, a basic Fourier transform was applied to determine the actual MRI image. Once the model reached a steady state, experimentation with hyperparameters helped achieve precise accuracy for the reconstructed images. The model was evaluated through L1 loss, gradient normalization, and structural similarity values. After training finished, it produced reconstructed images with L1 loss values averaging below 0.01 and gradient normalization values below 0.1. The reconstructed k-spaces achieved structural similarity values of over 99% against the fully sampled k-spaces for both training and validation, while validation loss continually decreased below 0.01. These data strongly support the idea that the algorithm works for MRI reconstruction, as they indicate that the model's reconstructed images align extremely well with the original, fully sampled k-spaces.
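The k-space pipeline described above can be sketched in a few lines: undersample a fully sampled k-space with a mask (the part a MIM model would be trained to fill in), then apply the inverse Fourier transform to obtain an image. This is a minimal assumption-laden sketch using a random array as a stand-in slice and zero-filling where the study's model would inpaint the masked rows.

```python
import numpy as np

rng = np.random.default_rng(2)

image = rng.random((64, 64))                    # stand-in for an MRI slice
kspace = np.fft.fftshift(np.fft.fft2(image))    # fully sampled k-space

# Undersampling mask: keep every other phase-encoding row (illustrative pattern).
mask = np.zeros((64, 64), dtype=bool)
mask[::2, :] = True
undersampled = np.where(mask, kspace, 0)

# Zero-filled reconstruction: what the model's input looks like in image space.
# A MIM model would instead predict the zeroed rows before this transform.
recon = np.abs(np.fft.ifft2(np.fft.ifftshift(undersampled)))
```

With the full k-space, the inverse transform recovers the original image exactly; with the zero-filled one, aliasing artifacts appear, which is the gap the prediction model is meant to close.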
Diffusion MRI tractography is an advanced imaging technique that enables in vivo mapping of the brain's white matter connections. White matter parcellation classifies tractography streamlines into clusters or anatomically meaningful tracts, enabling quantification and visualization of whole-brain tractography. Currently, most parcellation methods focus on the deep white matter (DWM), while fewer methods address the superficial white matter (SWM) due to its complexity. We propose a novel two-stage deep-learning-based framework, Superficial White Matter Analysis (SupWMA), that performs an efficient and consistent parcellation of 198 SWM clusters from whole-brain tractography. A point-cloud-based network is adapted to our SWM parcellation task, and supervised contrastive learning enables more discriminative representations between plausible SWM streamlines and outliers. We train our model on a large-scale tractography dataset including streamline samples from labeled SWM clusters and anatomically implausible streamline samples, and we test it on six independently acquired datasets spanning different ages and health conditions (including neonates and patients with space-occupying brain tumors). Compared to several state-of-the-art methods, SupWMA obtains highly consistent and accurate SWM parcellation results on all datasets, generalizing well across the lifespan in health and disease. In addition, SupWMA's computational speed is much faster than that of other methods.
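The supervised contrastive learning step can be illustrated with a minimal NumPy version of a SupCon-style loss: embeddings sharing a cluster label are pulled together relative to all others. This is a sketch under assumptions (toy 2-D embeddings, an illustrative temperature), not SupWMA's training code.

```python
import numpy as np

def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss over L2-normalized embeddings (sketch)."""
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    n, total, count = len(labels), 0.0, 0
    for i in range(n):
        pos_mask = (labels == labels[i]) & (np.arange(n) != i)
        if not pos_mask.any():
            continue
        # Denominator over all candidates except the anchor itself.
        log_denom = np.log(np.exp(np.delete(sim[i], i)).sum())
        # Average -log p(positive) over this anchor's positives.
        total += np.mean(log_denom - sim[i][pos_mask])
        count += 1
    return total / count

labels = np.array([0, 0, 1, 1])
clustered = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
mixed = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
# Embeddings grouped by label yield a lower loss than mixed-up ones.
```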
Federated learning (FL) is an emerging technique for collaboratively training a global machine learning model while keeping data localized on user devices. The major obstacle to practical FL deployment is the non-independent and identically distributed (non-IID) data across users, which slows convergence and degrades performance. To address this fundamental issue, we propose a method (ComFed) that enhances the whole training process on both the client and server sides. The key idea of ComFed is to simultaneously utilize client-side variance reduction techniques to facilitate server aggregation and global adaptive update techniques to accelerate learning. Our experiments on the CIFAR-10 classification task show that ComFed can improve upon state-of-the-art algorithms dedicated to non-IID data.
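The server-side half of this idea can be sketched as treating the averaged client update as a pseudo-gradient for an adaptive (Adam-style) server step, in the spirit of server-side adaptive federated optimization. This is a hypothetical minimal sketch, not ComFed itself: the hyperparameters are illustrative and the client-side variance-reduction correction is assumed to already be folded into the deltas the clients return.

```python
import numpy as np

def server_adaptive_update(global_w, client_deltas, m, v,
                           lr=0.1, b1=0.9, b2=0.99, eps=1e-8):
    """One adaptive server step on the averaged client pseudo-gradient."""
    delta = np.mean(client_deltas, axis=0)       # server aggregation
    m = b1 * m + (1 - b1) * delta                # first-moment estimate
    v = b2 * v + (1 - b2) * delta ** 2           # second-moment estimate
    new_w = global_w + lr * m / (np.sqrt(v) + eps)
    return new_w, m, v

w = np.zeros(3)
m = np.zeros(3)
v = np.zeros(3)
# Two clients pushing the model in roughly the same direction.
deltas = [np.array([0.1, 0.2, -0.1]), np.array([0.3, 0.0, -0.1])]
w, m, v = server_adaptive_update(w, deltas, m, v)
```

The adaptive step normalizes per-coordinate update magnitudes, which is one way a server can accelerate learning when client updates are noisy under non-IID data.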
In this work, we focus on generating graphical representations of noisy, instructional videos for video understanding. We propose a self-supervised, interpretable approach that does not require any annotations for graph representations, which would be expensive and time-consuming to collect. We attempt to overcome "black box" learning limitations by presenting Semantic Video Graph, or SVGraph, a multi-modal approach that leverages narrations for semantic interpretability of the learned graphs. SVGraph 1) relies on the agreement between multiple modalities to learn a unified graphical structure with the help of cross-modal attention, and 2) assigns semantic interpretation with the help of Semantic-Assignment, which captures semantics from the video narration. We perform experiments on multiple datasets and demonstrate the interpretability of SVGraph in semantic graph learning.
White matter tract microstructure has been shown to influence neuropsychological scores of cognitive performance. However, prediction of these scores from white matter tract data has not yet been attempted. In this paper, we propose a deep-learning-based framework for neuropsychological score prediction using microstructure measurements estimated from diffusion magnetic resonance imaging (dMRI) tractography, focusing on predicting performance on a receptive vocabulary assessment task based on a critical fiber tract for language, the arcuate fasciculus (AF). We directly utilize information from all points in a fiber tract, without the need to average data along the fiber as is traditionally required by diffusion MRI tractometry methods. Specifically, we represent the AF as a point cloud with microstructure measurements at each point, enabling the adoption of point-based neural networks. We improve prediction performance with a proposed Paired-Siamese Loss that utilizes information about differences between continuous neuropsychological scores. Finally, we propose a Critical Region Localization (CRL) algorithm to localize informative anatomical regions containing points with strong contributions to the prediction results. Our method is evaluated on data from 806 subjects from the Human Connectome Project dataset. The results demonstrate superior neuropsychological score prediction performance compared to baseline methods. We find that the critical regions in the AF are strikingly consistent across subjects, with the strongest contributions coming from frontal cortical regions (i.e., the caudal middle frontal, pars opercularis, and pars triangularis), which are strongly implicated in language processing.
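A paired loss that exploits differences between continuous scores can be illustrated as follows. This is a hedged sketch of the general idea, not the paper's exact formulation or weighting: for a pair of subjects, penalize both the individual score errors and the error in the predicted score *difference*, so the model is also trained on score ordering between subjects.

```python
def paired_siamese_loss(pred_a, pred_b, true_a, true_b, alpha=0.5):
    """Pairwise regression loss: individual MSE plus a score-difference term.

    `alpha` (illustrative) weights the difference term against the MSE terms.
    """
    mse = (pred_a - true_a) ** 2 + (pred_b - true_b) ** 2
    diff = ((pred_a - pred_b) - (true_a - true_b)) ** 2
    return mse + alpha * diff

loss = paired_siamese_loss(pred_a=1.0, pred_b=2.0, true_a=1.5, true_b=1.5)
# individual errors: 0.25 + 0.25; difference error: (-1 - 0)^2 = 1
# loss == 0.25 + 0.25 + 0.5 * 1 == 1.0
```

Two predictions with the correct difference but a shared offset incur only the MSE terms, while predictions that invert the ordering of two subjects are penalized extra by the difference term.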